
69 research outputs found

Online Unsupervised Multi-view Feature Selection

Author: He Lifang
Lu Chun-Ta
Shao Weixiang
Wei Xiaokai
Yu Philip S.
Publication date: 27/09/2016

In the era of big data, it is becoming common to have data with multiple modalities or coming from multiple sources, known as "multi-view data". Because multi-view data are usually unlabeled and come from high-dimensional spaces (such as language vocabularies), unsupervised multi-view feature selection is crucial to many applications. However, it is nontrivial due to the following challenges. First, there may be too many instances, or the feature dimensionality may be too large, so the data may not fit in memory. How can useful features be selected with limited memory space? Second, how can features be selected from streaming data while handling concept drift? Third, how can the consistent and complementary information from different views be leveraged to improve feature selection when the data are too big or arrive as streams? To the best of our knowledge, none of the previous works can solve all of these challenges simultaneously. In this paper, we propose an Online unsupervised Multi-View Feature Selection method, OMVFS, which deals with large-scale/streaming multi-view data in an online fashion. OMVFS embeds unsupervised feature selection into a clustering algorithm via NMF with sparse learning. It further incorporates graph regularization to preserve local structure information and help select discriminative features. Instead of storing all the historical data, OMVFS processes the multi-view data chunk by chunk and aggregates all the necessary information into several small matrices. By using a buffering technique, the proposed OMVFS can reduce the computational and storage cost while taking advantage of the structure information. Furthermore, OMVFS can capture concept drift in the data streams. Extensive experiments on four real-world datasets show the effectiveness and efficiency of the proposed OMVFS method. More importantly, OMVFS is about 100 times faster than the off-line methods.
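
The chunk-by-chunk aggregation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the OMVFS algorithm: it only assumes, for illustration, that the information worth keeping is a small feature-by-feature Gram matrix, and the names `gram_update` and `stream_gram` are hypothetical. The point is that the aggregate has a fixed size no matter how many instances have streamed past, and that it matches the result computed on the full data at once.

```python
from typing import Iterable, List

Matrix = List[List[float]]


def gram_update(gram: Matrix, chunk: Matrix) -> None:
    """Add chunk^T @ chunk to gram in place (gram is d x d, chunk is n x d)."""
    d = len(gram)
    for row in chunk:
        for i in range(d):
            for j in range(d):
                gram[i][j] += row[i] * row[j]


def stream_gram(chunks: Iterable[Matrix], d: int) -> Matrix:
    """Aggregate a stream of chunks into one d x d matrix.

    Each chunk can be discarded after it is folded in, so memory use depends
    on the feature count d, not on the number of instances seen so far.
    """
    gram = [[0.0] * d for _ in range(d)]
    for chunk in chunks:
        gram_update(gram, chunk)
    return gram


# Toy data (an assumption for illustration): 4 instances, 2 features.
data = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0], [1.0, 1.0]]
online = stream_gram([data[:2], data[2:]], d=2)  # two chunks of two rows
batch = stream_gram([data], d=2)                 # everything at once
```

Here `online` and `batch` hold the same 2 x 2 matrix, which is why a streaming method can drop each chunk once its contribution has been accumulated.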

arXiv.org e-Print Archive

Crossref

MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks

Author: Defferrard Michaël
Kipf Thomas N
Li Jundong
Ma Yao
Shi Yu
Wei Xiaokai
Ying Rex
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/08/2019

Graph embedding is an important approach for graph analysis tasks such as node classification and link prediction. The goal of graph embedding is to find a low-dimensional representation of graph nodes that preserves the graph information. Recent methods such as the Graph Convolutional Network (GCN) consider node attributes (if available) in addition to node relations and learn node embeddings for unsupervised and semi-supervised tasks on graphs. Multi-layer graph analysis has also received attention recently. However, the existing methods for multi-layer graph embedding cannot incorporate all available information (such as node attributes). Moreover, most of them consider either the type of nodes or the type of edges, and they do not treat within-layer and between-layer edges differently. In this paper, we propose a method called MGCN that utilizes the GCN for multi-layer graphs. MGCN embeds nodes of multi-layer graphs using both within-layer and between-layer relations and node attributes. We evaluate our method on the semi-supervised node classification task. Experimental results demonstrate the superiority of the proposed method over other multi-layer and single-layer competitors and also show the positive effect of using cross-layer edges.

arXiv.org e-Print Archive

Crossref

Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner

Author: Arnold Andrew
Chen Xinchi
Dong Rui
Huang Zhiheng
Ma Xiaofei
Ribeiro Danilo
Roth Dan
Wang Shen
Wei Xiaokai
Xu Peng
Zhu Henry
Publication date: 19/07/2022

Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain and inspect a QA system's answer. In order to better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. Contrary to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions, and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness. Comment: published in NAACL 202

arXiv.org e-Print Archive

Cloning and characterization of a selenium-independent glutathione peroxidase (HC29) from adult Haemonchus contortus

Author: Lixin Xu
Ruofeng Yan
Wei Sun
Xiangrui Li
Xiaokai Song
Publication venue: The Korean Society of Veterinary Science
Publication date: 01/01/2012

The complete coding sequence of Haemonchus (H.) contortus HC29 cDNA was generated by rapid amplification of cDNA ends in combination with PCR using primers targeting the 5'- and 3'-ends of the partial mRNA sequence. The cloned HC29 cDNA was shown to be 1,113 bp in size with an open reading frame of 507 bp, encoding a protein of 168 amino acids with a calculated molecular mass of 18.9 kDa. Amino acid sequence analysis revealed that the cloned HC29 cDNA contained the conserved catalytic triad and dimer interface of selenium-independent glutathione peroxidase (GPX). Alignment of the predicted amino acid sequences demonstrated that the protein shared 44.7–80.4% similarity with GPX homologues in the thioredoxin-like family. Phylogenetic analysis revealed close evolutionary proximity of the GPX sequence to the counterpart sequences. These results suggest that HC29 cDNA is a GPX, a member of the thioredoxin-like family. Alignment of the nucleic acid and amino acid sequences of HC29 with those of the reported selenium-independent GPX of H. contortus showed that HC29 contained different types of spliced leader sequences as well as dimer interface sites, although the active sites of both were identical. Enzymatic analysis of recombinant prokaryotic HC29 protein showed activity for the hydrolysis of H2O2. These findings indicate that HC29 is a selenium-independent GPX of H. contortus.
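
The figures in the abstract are mutually consistent, which a short Python check makes explicit. The sketch assumes that the reported 507 bp open reading frame includes the stop codon (507 / 3 = 169 codons, one of which does not code for an amino acid); the function name `orf_to_residues` is hypothetical.

```python
def orf_to_residues(orf_bp: int) -> int:
    """Number of amino acids encoded by an ORF that includes its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the final codon is the stop codon


# Values taken from the abstract: 507 bp ORF, 18.9 kDa calculated mass.
residues = orf_to_residues(507)
mean_residue_da = 18.9 * 1000 / residues  # average mass per residue in Da
```

With these inputs `residues` is 168, matching the abstract, and the average mass per residue comes out near 112.5 Da, which is in the usual range for proteins.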

Crossref

PubMed Central

The Euscaphis japonica genome and the evolution of malvids

Author: Chen De-Qiang
Din Le
Din Xiang-Qing
Huang Wei
Jiang Yu-Ting
Li Yifan
Li Zhen
Liao Xing-Yu
Liu Bobin
Liu Xue-Die
Liu Zhong-Jian
Ma Xiaokai
Ni Hui
Ni Lin
Qiu Meng-Yuan
Sun Wei-Hong
Van de Peer Yves
Wang Yifan
Wang Zhi-Wen
Wu Xi
Xiang Shuang
Xiao Lin
Yue Yi-Xun
Zhang Diyang
Zhang Pei-Lan
Zhang Qi-Gong
Zou Shuang-Quan
Zou Xiao-Xing
Publication venue: 'Wiley'
Publication date: 01/01/2021

Malvids, one of the largest clades of rosids, includes 58 families and exhibits remarkable morphological and ecological diversity. Here, we report a high-quality chromosome-level genome assembly for Euscaphis japonica, an early-diverging species within malvids. Genome-based phylogenetic analysis suggests that the unstable phylogenetic position of E. japonica may result from incomplete lineage sorting and hybridization events during the diversification of the ancestral population of malvids. Euscaphis japonica experienced two polyploidization events: the ancient whole genome triplication event shared with most eudicots (commonly known as the γ event) and a more recent whole genome duplication event, unique to E. japonica. By resequencing 101 samples from 11 populations, we speculate that temperature has led to the differentiation of evergreen and deciduous E. japonica and to the completely different population histories of these two groups. In total, 1012 candidate positively selected genes in the evergreen group were detected, some of which are involved in flower and fruit development. We found that reddening and dehiscence of the E. japonica pericarp and long fruit-hanging time promoted the reproduction of E. japonica populations, and revealed the expression patterns of genes related to fruit reddening, dehiscence and abscission. The key genes involved in pentacyclic triterpene synthesis in E. japonica were identified, and different expression patterns of these genes may contribute to pentacyclic triterpene diversification. Our work sheds light on the evolution of E. japonica and malvids, particularly on the diversification of E. japonica and the genetic basis for its fruit dehiscence and abscission.

DATA AVAILABILITY STATEMENT: All sequences described in this manuscript have been submitted to the National Genomics Data Center (NGDC). The raw whole-genome data of E. japonica have been deposited in BioProject/GSA (https://bigd.big.ac.cn/gsa.) under the accession codes PRJCA005268/CRA004271, and the assembly and annotation data have been deposited at BioProject/GWH (https://bigd.big.ac.cn/gwh) under the accession codes PRJCA005268/GWHBCHS00000000. The raw transcriptomes data of E. japonica have been deposited in BioProject/GSA (https://bigd.big.ac.cn/gsa.) under the accession codes PRJCA005298/CRA004272.

SUPPLEMENTARY MATERIAL 1: Supplementary Note 1. Chromosome number assessment. Supplementary Note 2. Whole-genome duplication identification and dating. Supplementary Note 3. Observation of E. japonica seed dispersal. Supplementary Note 4. Determination of pentacyclic triterpene substances. Figure S1. Cytogenetic analysis of E. japonica. Figure S2. Genome size and heterozygosity of E. japonica estimation using 17 k-mer distribution. Figure S3. Interchromosomal of Hi-C chromosome contact map of E. japonica genome. Figure S4. Gene structure prediction results of E. japonica and other species. Figure S5. Venn diagram shows gene families of malvids. Figure S6. Phylogenetic tree constructed by chloroplast genomes from 17 species. Figure S7. Concatenated- and ASTRAL-based phylogenetic trees. Figure S8. Ks distribution in E. japonica. Figure S9. Distributions of synonymous substitutions per synonymous site (Ks) of one-to-one orthologs identified between E. japonica and P. trichocarpa and V. vinifera. Figure S10. Population structure plot. Figure S11. Fixation index (FST) heat map among E. japonica populations. Figure S12. Phylogenetic analysis of MADS-box genes from O. sativa, A. thaliana, E. japonica, and T. cacao. Figure S13. Observation the fruit development. Figure S14. Animal seed dispersal. Figure S15. Anthocyanin biosynthesis in E. japonica fruits. Figure S16. Carotenoid accumulation and the chlorophyll degradation in E. japonica fruits. Figure S17. Expression profile of fruit dehiscence-related genes. Figure S18. Phylogenetic tree of DELLA genes obtained from six malvids species. Figure S19. Phylogenetic tree of CAD genes obtained from seven malvids species. Figure S20. Expression pattern of fruit abscission-related genes. Figure S21. Structure of pentacyclic triterpene compounds separated from Euscaphis. Figure S22. Phylogenetic tree of HMGR gene in plants. Figure S23. Phylogenetic tree of P450s gene family obtained from A. thaliana and E. japonica.

SUPPLEMENTARY MATERIAL 2: Table S1. Assembled statistics of E. japonica genome. Table S2. Evaluation of E. japonica genome assembly. Table S3. Chromosome length of E. japonica. Table S4. Prediction of gene structures of the E. japonica genome. Table S5. Statistics on the function annotation of the E. japonica genome. Table S6. Non-coding RNA annotation results of E. japonica genome. Table S7. BUSCO assessment of the E. japonica annotated genome. Table S8. Statistic of repeat sequence in E. japonica genome. Table S9. Gene-clustering statistics for 17 species. Table S10. KEGG enrichment result of unique genes families of E. japonica. Table S11. Gene Ontology (GO) and KEGG enrichment result of significant shared by malvids species gene families. Table S12. Gene Ontology (GO) and KEGG enrichment result of significant expansion of E. japonica gene families. Table S13. Gene Ontology (GO) enrichment result of significant contraction of E. japonica gene families. Table S14. Statistical sampling population information. Table S15. Statistics population resequencing information. Table S16. Statistical nucleotide polymorphisms in the populations. Table S17. Candidate positive selection genes (PSGs) in the evergreen population. Table S18. Candidate positive selection genes (PSGs) in the deciduous population. Table S19. Gene Ontology (GO) enrichment result of significant PSGs in the evergreen population. Table S20. List of MADS-box genes identified in E. japonica. Table S21. Genes involved in anthocyanin biosynthesis, carotenoid biosynthesis, and chlorophyll degradation. Table S22. Identification fruit dehiscence-related genes in E. japonica. Table S23. Genes related to lignin synthesis that are highly expressed during pericarp dehiscence. Table S24. Gene expression levels (FPKMs) of fruit abscission-related genes in pericarp. Table S25. Triterpene compounds separated from Euscaphis. Table S26. Number of putative pentacyclic triterpene-related genes in the malvids species. Table S27. Identified pentacyclic triterpene synthesis-related genes in E. japonica genome. Table S28. Statistical simple sequence repeat.

Fund for Excellent Doctoral Dissertation of Fujian Agriculture and Forestry University, China; Fujian Provincial Department of Science E. japonica Evolution and Selection of Ornamental Medicinal Resources, China; the Project of Forestry Peak Discipline at Fujian Agriculture and Forestry University, China; the Collection, Development and Utilization of Eascaphis konlshli Germplasm Resources; the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program and from Ghent University.

https://onlinelibrary.wiley.com/journal/1365313x

Ghent University Academic Bibliography

PubMed Central

UPSpace at the University of Pretoria

Optimization of a New High Rotary Missile-Borne Stabilization Platform

Author: Xiaokai Wei
Publication venue: 'MDPI AG'
Publication date: 24/09/2019

The passive semi-strapdown roll stabilized platform is an inertial platform that isolates the rolling of a projectile body by means of a special mechanical device. In this platform, the bearing device plays an important role in isolating the rolling of the projectile body. The smaller the friction moment of the bearing, the smaller the swing angular velocity of the platform, the smaller the required range of the inertial sensors, and the higher the accuracy of the navigation solution. In order to further reduce the swing angular velocity of the platform and improve the navigation accuracy, a bearing nested structure that can reduce the friction torque is proposed. Based on the working principle of the passive semi-strapdown roll stabilized platform, a mechanical calculation model of the friction moment of the bearing nested structure was established. A series of simulation analyses and tests showed that the output stability value of the friction moment was 47% of that of a single bearing; the roll rate of the platform based on the bearing nested structure decreased to 50% of that based on the single bearing structure; the position and attitude errors measured of the platform based on the bearing nested structure decreased to more than 50% of that based on the single bearing structure. The results showed that the bearing nested structure could effectively reduce the friction moment, improve the axial reliability of the bearing, and provide a more stable working environment for the passive semi-strapdown roll stabilized platform.

Multidisciplinary Digital Publishing Institute